SEQUENCE LISTING

(1) GENERAL INFORMATION:



(i) APPLICANT: LaVallie, Edward 
Racie, Lisa 



(ii) TITLE OF INVENTION: HUMAN SDF-5 PROTEIN AND COMPOSITIONS 



(iii) NUMBER OF SEQUENCES: 6 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: GENETICS INSTITUTE, INC. 

(B) STREET: 87 CAMBRIDGE PARK DRIVE

(C) CITY: CAMBRIDGE

(D) STATE: MA 

(E) COUNTRY: USA 

(F) ZIP: 02140

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25



(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/949,904 

(B) FILING DATE: October 15, 1997 

(C) CLASSIFICATION: 



(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: LAZAR, STEVEN R. 

(B) REGISTRATION NUMBER: 32,618 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (617) 498-8260 

(B) TELEFAX: (617) 876-5851 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2027 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GAATTCGGCC TTCATGGCCT AGCTCATTCT GCTCCCCCGG GTCGGAGCCC CCCGGAGCTG 60

CGCGCGGGCT TGCAGCGCCT CGCCCGCGCT CCTCCCGGTG TCCCGCTTCT CCGCGCCCCA 120

GCCGCCGGCT GCCAGCTTTT CGGGGCCCCG AGTCGCACCC AGCGAAGAGA GCGGGCCCGG 180 

GACAAGCTCG AACTCCGGCC GCCTCGCCCT TCCCCGGCTC CGCTCCCTCT GCCCCCTCGG 240 

GGTCGCGCGC CCACGATGCT GCAGGGCCCT GGCTCGCTGC TGCTGCTCTT CCTCGCCTCG 300

CACTGCTGCC TGGGCTCGGC GCGCGGGCTC TTCCTCTTTG GCCAGCCCGA CTTCTCCTAC 360

AAGCGCAGCA ATTGCAAGCC CATCCCGGCC AACCTGCAGC TGTGCCACGG CATCGAATAC 420

CAGAACATGC GGCTGCCCAA CCTGCTGGGC CACGAGACCA TGAAGGAGGT GCTGGAGCAG 480

GCCGGCGCTT GGATCCCGCT GGTCATGAAG CAGTGCCACC CGGACACCAA GAAGTTCCTG 540 

TGCTCGCTCT TCGCCCCCGT CTGCCTCGAT GACCTAGACG AGACCATCCA GCCATGCCAC 600 

TCGCTCTGCG TGCAGGTGAA GGACCGCTGC GCCCCGGTCA TGTCCGCCTT CGGCTTCCCC 660

TGGCCCGACA TGCTTGAGTG CGACCGTTTC CCCCAGGACA ACGACCTTTG CATCCCCCTC 720 

GCTAGCAGCG ACCACCTCCT GCCAGCCACC GAGGAAGCTC CAAAGGTATG TGAAGCCTGC 780

AAAAATAAAA ATGATGATGA CAACGACATA ATGGAAACGC TTTGTAAAAA TGATTTTGCA 840 

CTGAAAATAA AAGTGAAGGA GATAACCTAC ATCAACCGAG ATACCAAAAT CATCCTGGAG 900

ACCAAGAGCA AGACCATTTA CAAGCTGAAC GGTGTGTCCG AAAGGGACCT GAAGAAATCG 960

GTGCTGTGGC TCAAAGACAG CTTGCAGTGC ACCTGTGAGG AGATGAACGA CATCAACGCG 1020 

CCCTATCTGG TCATGGGACA GAAACAGGGT GGGGAGCTGG TGATCACCTC GGTGAAGCGG 1080 

TGGCAGAAGG GGCAGAGAGA GTTCAAGCGC ATCTCCCGCA GCATCCGCAA GCTGCAGTGC 1140 

TAGTCCCGGC ATCCTGATGG CTCCGACAGG CCTGCTCCAG AGCACGGCTG ACCATTTCTG 1200

CTCCGGGATC TCAGCTCCCG TTCCCCAAGC ACACTCCTAG CTGCTCCAGT CTCAGCCTGG 1260

GCAGCTTCCC CCTGCCTTTT GCACGTTTGC ATCCCCAGCA TTTCCTGAGT TATAAGGCCA 1320

CAGGAGTGGA TAGCTGTTTT CACCTAAAGG AAAAGCCCAC CCGAATCTTG TAGAAATATT 1380

CAAACTAATA AAATCATGAA TATTTTTATG AAGTTTAAAA ATAGCTCACT TTAAAGCTAG 1440

TTTTGAATAG GTGCAACTGT GACTTGGGTC TGGTTGGTTG TTGTTTGTTG TTTTGAGTCA 1500 

GCTGATTTTC ACTTCCCACT GAGGTTGTCA TAACATGCAA ATTGCTTCAA TTTTCTCTGT 1560 

GGCCCAAACT TGTGGGTCAC AAACCCTGTT GAGATAAAGC TGGCTGTTAT CTCAACATCT 1620

TCATCAGCTC CAGACTGAGA CTCAGTGTCT AAGTCTTACA ACAATTCATC ATTTTATACC 1680 

TTCAATGGGA ACTTAAACTG TTACATGTAT CACATTCCAG CTACAATACT TCCATTTATT 1740

AGAAGCACAT TAACCATTTC TATAGCATGA TTTCTTCAAG TAAAAGGCAA AAGATATAAA 1800

TTTTATAATT GACTTGAGTA CTTTAAGCCT TGTTTAAAAC ATTTCTTACT TAACTTTTGC 1860

AAATTAAACC CATTGTAGCT TACCTGTAAT ATACATAGTA GTTTACCTTT AAAAGTTGTA 1920 

AAAATATTGC TTTAACCAAC ACTGTAAATA TTTCAGATAA ACATTATATT CTTGTATATA 1980

AACTTTACAT CCTGTTTTAC CTAAAAAAAA AAAAAAAAAG CGGCCGC 2027

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 295 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Leu Gln Gly Pro Gly Ser Leu Leu Leu Leu Phe Leu Ala Ser His
1               5                   10                  15

Cys Cys Leu Gly Ser Ala Arg Gly Leu Phe Leu Phe Gly Gln Pro Asp
            20                  25                  30

Phe Ser Tyr Lys Arg Ser Asn Cys Lys Pro Ile Pro Ala Asn Leu Gln
        35                  40                  45

Leu Cys His Gly Ile Glu Tyr Gln Asn Met Arg Leu Pro Asn Leu Leu
    50                  55                  60

Gly His Glu Thr Met Lys Glu Val Leu Glu Gln Ala Gly Ala Trp Ile
65                  70                  75                  80

Pro Leu Val Met Lys Gln Cys His Pro Asp Thr Lys Lys Phe Leu Cys
                85                  90                  95

Ser Leu Phe Ala Pro Val Cys Leu Asp Asp Leu Asp Glu Thr Ile Gln
            100                 105                 110

Pro Cys His Ser Leu Cys Val Gln Val Lys Asp Arg Cys Ala Pro Val
        115                 120                 125

Met Ser Ala Phe Gly Phe Pro Trp Pro Asp Met Leu Glu Cys Asp Arg
    130                 135                 140

Phe Pro Gln Asp Asn Asp Leu Cys Ile Pro Leu Ala Ser Ser Asp His
145                 150                 155                 160

Leu Leu Pro Ala Thr Glu Glu Ala Pro Lys Val Cys Glu Ala Cys Lys
                165                 170                 175

Asn Lys Asn Asp Asp Asp Asn Asp Ile Met Glu Thr Leu Cys Lys Asn
            180                 185                 190

Asp Phe Ala Leu Lys Ile Lys Val Lys Glu Ile Thr Tyr Ile Asn Arg
        195                 200                 205

Asp Thr Lys Ile Ile Leu Glu Thr Lys Ser Lys Thr Ile Tyr Lys Leu
    210                 215                 220

Asn Gly Val Ser Glu Arg Asp Leu Lys Lys Ser Val Leu Trp Leu Lys
225                 230                 235                 240

Asp Ser Leu Gln Cys Thr Cys Glu Glu Met Asn Asp Ile Asn Ala Pro
                245                 250                 255

Tyr Leu Val Met Gly Gln Lys Gln Gly Gly Glu Leu Val Ile Thr Ser
            260                 265                 270

Val Lys Arg Trp Gln Lys Gly Gln Arg Glu Phe Lys Arg Ile Ser Arg
        275                 280                 285

Ser Ile Arg Lys Leu Gln Cys
    290                 295

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 275 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Ser Ala Arg Gly Leu Phe Leu Phe Gly Gln Pro Asp Phe Ser Tyr Lys
1               5                   10                  15

Arg Ser Asn Cys Lys Pro Ile Pro Ala Asn Leu Gln Leu Cys His Gly
            20                  25                  30

Ile Glu Tyr Gln Asn Met Arg Leu Pro Asn Leu Leu Gly His Glu Thr
        35                  40                  45

Met Lys Glu Val Leu Glu Gln Ala Gly Ala Trp Ile Pro Leu Val Met
    50                  55                  60

Lys Gln Cys His Pro Asp Thr Lys Lys Phe Leu Cys Ser Leu Phe Ala
65                  70                  75                  80

Pro Val Cys Leu Asp Asp Leu Asp Glu Thr Ile Gln Pro Cys His Ser
                85                  90                  95

Leu Cys Val Gln Val Lys Asp Arg Cys Ala Pro Val Met Ser Ala Phe
            100                 105                 110

Gly Phe Pro Trp Pro Asp Met Leu Glu Cys Asp Arg Phe Pro Gln Asp
        115                 120                 125

Asn Asp Leu Cys Ile Pro Leu Ala Ser Ser Asp His Leu Leu Pro Ala
    130                 135                 140

Thr Glu Glu Ala Pro Lys Val Cys Glu Ala Cys Lys Asn Lys Asn Asp
145                 150                 155                 160

Asp Asp Asn Asp Ile Met Glu Thr Leu Cys Lys Asn Asp Phe Ala Leu
                165                 170                 175

Lys Ile Lys Val Lys Glu Ile Thr Tyr Ile Asn Arg Asp Thr Lys Ile
            180                 185                 190

Ile Leu Glu Thr Lys Ser Lys Thr Ile Tyr Lys Leu Asn Gly Val Ser
        195                 200                 205

Glu Arg Asp Leu Lys Lys Ser Val Leu Trp Leu Lys Asp Ser Leu Gln
    210                 215                 220

Cys Thr Cys Glu Glu Met Asn Asp Ile Asn Ala Pro Tyr Leu Val Met
225                 230                 235                 240

Gly Gln Lys Gln Gly Gly Glu Leu Val Ile Thr Ser Val Lys Arg Trp
                245                 250                 255

Gln Lys Gly Gln Arg Glu Phe Lys Arg Ile Ser Arg Ser Ile Arg Lys
            260                 265                 270

Leu Gln Cys
        275

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
CATGGGCAGC TCGAG 15

(2) INFORMATION FOR SEQ ID NO:5:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
CTGCAGGCGA GCCTGAATTC CTCGAGCCAT CATG 34



(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 83 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CCCCTGTGGG TAGAACGAGG TTAAAAAACG TCTAGGCCCC CCGAACCACG GGGACGTGGT 60 
TTTCCTTTGA AAAACACGAT TGC 83 